### A Pluto.jl notebook ###
# v0.20.4

using Markdown
using InteractiveUtils

# ╔═╡ d469c921-1c1b-49b7-bbad-9bfb1e53fbd4
import Pkg; Pkg.activate(".")

# ╔═╡ 3c2986d8-2379-4cd5-a747-435a5180076f
begin
	using CairoMakie
	using CommonMark
	using Format
	using PlutoUI
end

# ╔═╡ 72a2571d-dc92-41d6-b73b-ba9e985ca4f1
md"""
**What is this?**


*This notebook is part of a collection of notebooks on various topics discussed during the Time Domain Astrophysics course delivered by Stefano Covino at the [Università dell'Insubria](https://www.uninsubria.eu/) in Como (Italy). Please direct questions and suggestions to [stefano.covino@inaf.it](mailto:stefano.covino@inaf.it).*
"""

# ╔═╡ 12200cd2-5e1e-4d96-9284-60fb950fd70a
md"""
**This is a `Julia` notebook**
"""

# ╔═╡ ca916ac4-71f2-4b58-878c-3e40cf341b46
Pkg.instantiate()

# ╔═╡ a81e5916-c0e7-410d-8c6e-917303ab087a
md"""
$(LocalResource("Pics/TimeDomainBanner.jpg"))
"""

# ╔═╡ 17903372-3749-4bfc-9519-7181bc0539f1
md"""
# Time of Arrival analysis

***

- So far, we have assumed that our data consist of a set of $N$ data points $(t_j,y_j), j = 1,...,N$, possibly with known errors for $y$.

- Other kinds of datasets are possible, however. For instance, at X-ray and shorter wavelengths, individual photons are detected and background contamination is often negligible. 
    - In such cases, the data set consists of the arrival times of individual photons $t_j, j=1,…,N$, or, in principle and more generally, the time of occurrence of a given phenomenon.

- Given such a data set, how do we search for a periodic signal, and more generally, how do we test for any type of variability?

"""

# ╔═╡ 0e4c318a-511c-4646-9b33-634e5cb87c6d
cm"""
## Rayleigh Test
***

- The best known classical test for variability in arrival time data is the *Rayleigh test*.

- Given a trial period, ``P``, the phase, ``φ_j``, corresponding to each datum, ``t_j``, is evaluated using: 

```math
\varphi_j = \frac{t_j}{P} - {\rm int} \left( \frac{t_j}{P} \right) 
```

- And the following statistic is computed:

```math
R^2 = \left( \sum_{j=1}^N \cos(2 \pi \varphi_j) \right)^2 + \left( \sum_{j=1}^N \sin(2 \pi \varphi_j) \right)^2 
```

- A simple way to read this expression is in terms of a random walk.
    - Each angle ``φ_j`` defines a unit vector, and ``R`` is the length of the resulting vector. 

- For random data ``R^2`` is small, and for periodic data ``R^2`` is large when the correct period is chosen.

- ``R^2`` is evaluated for a grid of periods, and the best period is chosen as the value that maximizes ``R^2``.

- For ``N > 10``, ``2R^2/N`` is distributed as ``\chi^2`` with two degrees of freedom. This easily follows from the random walk interpretation.

- Then, one can assess the significance of the best-fit period as the probability that “a value that large” would happen by chance when the signal is stationary.

$(LocalResource("Pics/rayleigh.png"))

- This plot (from [Weigt et al. 2019](https://ui.adsabs.harvard.edu/abs/2019EPSC...13.1660W/abstract)) shows a time-series of event times, the Rayleigh periodogram, and simulations to infer the significance of the periodogram maximum. 



- It is also possible to generalize the Rayleigh test by including multiple harmonics in the time of arrival decomposition:

```math
R_n^2 = \frac{2}{N} \sum_{k=1}^n \left[\left( \sum_{j=1}^N \cos(2 \pi k \varphi_j) \right)^2 + \left( \sum_{j=1}^N \sin(2 \pi k \varphi_j) \right)^2 \right] 
```

- ``n`` is of course the number of harmonics.
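
- The claim that ``2R^2/N`` follows a ``\chi^2`` distribution with two degrees of freedom for non-periodic data can be checked with a small simulation. The following is only a sketch: the helper `rayleigh_r2`, the seed, and the values of `N` and `nsim` are arbitrary choices made for illustration.

```julia
using Random

# Rayleigh statistic R² for a vector of phases in [0, 1)
rayleigh_r2(phases) = sum(cospi.(2 .* phases))^2 + sum(sinpi.(2 .* phases))^2

rng = MersenneTwister(42)   # arbitrary seed, for reproducibility only
N = 50                      # events per simulated data set (hypothetical)
nsim = 20_000               # number of simulated data sets (hypothetical)

# 2R²/N for data sets with uniformly random (i.e. non-periodic) phases
z = [2 * rayleigh_r2(rand(rng, N)) / N for _ in 1:nsim]

mean_z = sum(z) / nsim              # a χ² variable with 2 dof has mean 2
tail = count(>(6.0), z) / nsim      # fraction of simulations above 6
expected_tail = exp(-6.0 / 2)       # χ² (2 dof) survival function at 6

println("mean of 2R²/N = ", mean_z)
println("P(2R²/N > 6): simulated = ", tail, ", expected = ", expected_tail)
```

- The simulated tail fraction should be close to the ``\chi^2`` prediction, with small deviations expected for low ``N``.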
"""

# ╔═╡ 39c8c2cb-e6f6-4d74-8792-ba96e057744a
cm"""
### Exercise: DO climate events
***

- The paper by [Ditlevsen et al. 2006](https://ui.adsabs.harvard.edu/abs/2006AGUFMGC24A..07D/abstract) discusses a possible periodicity of about 1500 years for the Dansgaard-Oeschger (DO) events observed in the Greenland ice cores.

- According to Wikipedia, [Dansgaard–Oeschger events](https://en.wikipedia.org/wiki/Dansgaard%E2%80%93Oeschger_event) are rapid climate fluctuations that occurred during the last glacial period. Some scientists say that the events occur quasi-periodically with a recurrence time being a multiple of 1,470 years, but this is debated.

$(LocalResource("Pics/doevents.png"))

- The δ<sup>18</sup>O isotope records from NGRIP and GISP on their stratigraphic time scale. The vertical bars are separated by 1470 years. The analysis focuses on the well-defined fast onsets of DO events, which are transitions from the stadial to the interstadial states. Ages are b2k = BP + 50 years.

- B.P. (Before the Present) is the number of years before the present. Because the present changes every year, archaeologists, by convention, use A.D. 1950 as their reference. So, 2000 B.P. is the equivalent of 50 B.C.

- In geochemistry, paleoclimatology and paleoceanography δ<sup>18</sup>O or delta-O-18 is a measure of the ratio of stable isotopes oxygen-18 (<sup>18</sup>O) and oxygen-16 (<sup>16</sup>O). It is commonly used as a measure of the temperature of precipitation, as a measure of groundwater/mineral interactions, and as an indicator of processes that show isotopic fractionation, like methanogenesis. In paleosciences, <sup>18</sup>O:<sup>16</sup>O data from corals, foraminifera and ice cores are used as a proxy for temperature.
"""

# ╔═╡ b5dc35e4-0f85-468c-8f59-4ac0b9b00b3e
md"""
- Identifying a DO event is not an easy task, and there is debate about the actual definition of a "rapid fluctuation". We do not enter into this discussion here, and simply adopt the list reported by [Rahmstorf (2003)](https://ui.adsabs.harvard.edu/abs/2003GeoRL..30.1510R/abstract).
"""

# ╔═╡ 93f9b75c-bd43-4229-9ddd-faba108afe65
DOevts = [11605, 13073, 14630, 23398, 27821, 29021, 32293, 33581, 35270, 38387, 41143, 42537, 45362];

# ╔═╡ 5deba4b4-03ca-40f6-a6f7-aa17c2dc7eaf
md"""
- Let's now code a function to compute the Rayleigh periodogram:
"""

# ╔═╡ 93d67ed4-3811-414d-a86f-6e951d5e3dbf
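function z2n(freqs, time; harm=1)
    # Rayleigh (Z²ₙ) statistic of the event times `time` for each trial
    # frequency in `freqs`, summed over the first `harm` harmonics.
    N = length(time)
    Z2n = Float64[]
    for ni in freqs
        # Phase of each event, in [0, 1), for the trial frequency `ni`
        Phi = mod.(ni .* time, 1)
        aux = 0.0
        for k in 1:harm
            arg = 2π .* k .* Phi
            aux += sum(cos, arg)^2 + sum(sin, arg)^2
        end
        # Normalisation by 2/N, as in the formula for the generalized test
        push!(Z2n, (2 / N) * aux)
    end
    return Z2n
end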

# ╔═╡ 14175fbe-5aea-4fe0-b9b1-f99049d9cdde
md"""
- We study possible periods from 100 to 10000 years.
"""

# ╔═╡ 69256ec4-c8b2-4110-896f-ec8bfa01d662
begin
	per = range(start=100,stop=10000,step=100)
	freq = 1 ./ per
	res = z2n(freq,DOevts)
end;

# ╔═╡ 3cf771fc-30d2-4ded-b17e-f6b66a35d218
begin
	fg1 = Figure()
	
	ax1fg1 = Axis(fg1[1, 1],
	    xlabel="Period (year)",
	    ylabel=L"$2R^2/N$",
	    )
	
	lines!(per,res,label="Rayleigh periodogram")
	
	axislegend()
	
	fg1
end

# ╔═╡ 3b121627-05b3-4238-9901-441182885848
md"""
- Given that for large $N$ (which is NOT the case here) the Rayleigh statistic follows the $\chi^2$ distribution with two degrees of freedom, it is easy to compute the false-alarm probability (FAP) for the observed periodogram peak.

- Evaluating the number of independent periods (or frequencies) allowed by the input data is more difficult. In order to keep everything simple, we assume $N_{eff} \sim N/2$.
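
- Under these simplifying assumptions (a $\chi^2$ distribution with two degrees of freedom and $N_{eff}$ independent trial periods), the local and global FAP for a periodogram peak of height $z$ can be written as follows. This is only a sketch of the standard reasoning, where the symbol $z$ is introduced here for illustration: the survival function of a $\chi^2$ variable with two degrees of freedom is an exponential, and the global FAP is the probability that at least one of $N_{eff}$ independent trials exceeds $z$.

```math
{\rm FAP_{local}} = P(\chi^2_2 > z) = e^{-z/2}, \qquad {\rm FAP_{global}} = 1 - \left( 1 - {\rm FAP_{local}} \right)^{N_{eff}}
```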
"""

# ╔═╡ 961dae0a-3bda-4fa3-8197-ca7fafda3ca6
begin
	pmax,pidx = findmax(res)
	printfmtln("Periodogram maximum {:.1f} at period {:.1f} years", pmax, per[pidx])
	lfap = exp(-pmax/2)
	gfap = 1-(1-lfap)^(length(DOevts)/2.)
	printfmtln("Local FAP {:.2f}% and global FAP {:.2f}%", 100*lfap, 100*gfap)
end

# ╔═╡ f184d0dc-5034-45ed-9253-daff25eaa259
md"""
- The maximum is therefore found at a period $P \sim 1500$ years. Its local significance (given the assumptions discussed above) is $\sim 99\%$, while the global significance is $\sim 94\%$.

> Too low for a firm claim.
"""

# ╔═╡ bd88c8fa-856b-40d8-9763-942b465d5504
cm"""
- In order to go on, we need to recap some of the main features of the Poisson distribution.

## Poisson Distribution: a brief recap
***

- A Poisson process is a model for a series of discrete events where the average time between events is known, but the exact timing of events is random. The arrival of an event is independent of the event before (the waiting time between events is memoryless).

- A Poisson Process meets the following criteria (in reality many phenomena modeled as Poisson processes don’t meet these exactly):
    - Events are independent of each other. The occurrence of one event does not affect the probability another event will occur.
    - The average rate (events per time period) is constant.
    - Two events cannot occur at the same time.

- The last point means an event either happens or not!

"""

# ╔═╡ 9193b904-7dd0-4f2d-8f17-972eb1f82104
md"""
- Common examples of Poisson processes are customers calling a help center, visitors to a website, radioactive decay in atoms, photons arriving at a space telescope, and movements in a stock price. 

- Poisson processes are generally associated with time, but they do not have to be. In the stock case, we might know the average movements per day (events per time), but we could also have a Poisson process for the number of trees in an acre (events per area).

- One instance frequently given for a Poisson Process is bus arrivals (or trains or Ubers). However, this is not (always) a true Poisson process because the arrivals are not independent of one another. Even for bus systems that do not run on time, whether or not one bus is late affects the arrival time of the next bus.

"""

# ╔═╡ 9f3399e4-d6ba-457b-b5c4-b4acd0238e63
cm"""
- The Poisson Distribution probability mass function gives the probability of observing ``k`` events in a time period given the length of the period and the average events per time:

```math
P({\rm k\ events\ in\ a\ time\ period}) = e^{-\frac{\rm event}{\rm time} \times {\rm \ time\ period}} \frac{\left( \frac{\rm event}{\rm time} \times {\rm \ time\ period} \right)^k}{k!}
```

- ``{\rm events\ /\ time \times time\ period}`` is usually simplified into a single parameter, ``λ``, the rate parameter:

```math
P({\rm k\ events\ in\ interval}) = e^{-\lambda} \times \frac{\lambda^k}{k!}
```
 
- ``λ`` can be thought of as the *expected number of events in the interval*. 
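
- As a small numerical illustration of the formula above, the probabilities can be evaluated directly. This is a minimal sketch: the function name `poisson_pmf` and the values of `rate` and `interval` are hypothetical, chosen only for the example.

```julia
# Poisson probability of observing k events when λ events are expected
poisson_pmf(k, λ) = exp(-λ) * λ^k / factorial(k)

rate = 2.0            # hypothetical average rate: 2 events per hour
interval = 1.5        # hypothetical length of the observing interval, in hours
λ = rate * interval   # expected number of events in the interval

for k in 0:5
    println("P(", k, " events) = ", round(poisson_pmf(k, λ), digits=4))
end

# The probabilities over all k sum to 1 (terms beyond k = 20 are negligible here)
total = sum(poisson_pmf(k, λ) for k in 0:20)
println("sum over k = 0..20: ", total)
```

- Note that `factorial(k)` overflows for integer `k > 20`, so for larger counts a log-space evaluation would be preferable.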
"""

# ╔═╡ 92ee75bb-dc75-4f50-8e68-d69f8f3ef705
cm"""
### Waiting time
***

- The probability of waiting a given amount of time between successive events decreases exponentially as the time increases. The following equation shows the probability of waiting more than a specified time:

```math
P(T > t) = e^{-\frac{\rm event}{\rm time} \times t}
```

- Conversely, the probability of waiting less than or equal to a time:

```math
P(T \le t) = 1 - e^{-\frac{\rm event}{\rm time} \times t}
```
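
- The exponential law for the waiting times can be verified with a quick simulation. The sketch below draws waiting times by inverse transform sampling and compares the fraction exceeding a threshold with the formula for ``P(T > t)``. The seed and the values of `rate`, `t` and `nsim` are arbitrary choices for illustration.

```julia
using Random

rng = MersenneTwister(1)   # arbitrary seed, for reproducibility only
rate = 0.5                 # hypothetical rate: 0.5 events per unit time
t = 3.0                    # waiting time threshold
nsim = 100_000             # number of simulated waiting times (hypothetical)

# Exponential waiting times via inverse transform sampling:
# if U is uniform in (0, 1], then -log(U) / rate is exponential with that rate
waits = [-log(1 - rand(rng)) / rate for _ in 1:nsim]

simulated = count(>(t), waits) / nsim   # fraction of waits longer than t
expected = exp(-rate * t)               # P(T > t) from the formula above

println("P(T > t): simulated = ", simulated, ", expected = ", expected)
```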

"""

# ╔═╡ f9bb204d-53d4-4ac1-92dc-4faf5faba96b
md"""
## The Gregory & Loredo algorithm
***

- Let’s divide the time interval $T = t_N − t_1$  into many arbitrarily small steps, $Δt$, so that each interval contains either 1 or 0 detections.

- Given the event rate $r(t)$, then the expectation value for the number of events during $Δt$ is:

```math
\mu(t) = r(t) \Delta t
```

- Poisson statistics tells us that the probability of detecting no events during $Δt$ is: 

```math
p(0) = e^{-r(t) \Delta t}
```

- Analogously, the probability of detecting a single event is:

```math
p(1) = r(t) \Delta t\ e^{-r(t) \Delta t}
```

- We can now compute the likelihood:

```math
p(D | r,I) = (\Delta t)^N e^{-\int_{(T)} r(t) dt} \prod_{j=1}^N r(t_j)
```

- For simplicity, we assume that the data were collected in a single stretch of time with no gaps. But it is possible to generalize the algorithm.

- With an appropriate $r(t)$, and priors for the model parameters, analysis of arrival time data is no different than any other model selection and parameter estimation problem.

- Instead of fitting a parametrized model, such as a Fourier series, a nonparametric description of the rate function $r(t)$ is adopted here.

- The shape of the phased light curve is described using a piecewise constant function, $f_j$, with $M$ steps of the same width, and with $\sum_j f_j = 1$.

- The rate is therefore described as: $r(t_j) \equiv r_j = M A f_j$,

    - where $A$ is the average rate, and the bin $j$ corresponding to $t_j$ is determined from the phase of $t_j$ for the trial period. 

- The model includes the frequency $ω$ (or period), a phase offset, the average rate $A$, and $M − 1$ parameters $f_j$.

- Marginalizing the resulting pdf one can produce 
    1. an analog of the periodogram, 
    2. expressions for computing the model odds ratio for signal detection, and
    3. an estimate of the light curve shape. 

- In the case when little is known about the signal shape, this method is superior to the more popular Fourier series expansion.
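
- To make the likelihood for the piecewise constant model more concrete, here is a minimal sketch of its logarithm, dropping the constant term `N*log(Δt)`. It is not the full Gregory & Loredo algorithm (no priors, no marginalization). The function name `stepwise_loglike`, the keyword `phase0` and the toy event times are hypothetical, and the integral of the rate is assumed to equal `A * T`, which holds when the observation spans a whole number of periods.

```julia
# Log-likelihood (up to the constant N*log(Δt)) of event times for a
# piecewise-constant periodic rate r = M * A * f[j], with sum(f) == 1.
# Assumption: the observation spans a whole number of periods, so that the
# integral of r(t) over the observation equals A * T.
function stepwise_loglike(times, period, A, f; phase0=0.0)
    M = length(f)
    T = maximum(times) - minimum(times)
    # Phase of each event in [0, 1), then index of the bin it falls in
    phases = mod.(times ./ period .+ phase0, 1)
    bins = min.(floor.(Int, phases .* M) .+ 1, M)
    return -A * T + sum(log.(M .* A .* f[bins]))
end

# Hypothetical toy data: events clustered in the first half of a period of 10
times = [0.0, 1.2, 2.1, 10.4, 11.3, 12.8, 20.9, 21.7, 23.0, 30.0]
A = length(times) / (maximum(times) - minimum(times))   # average rate estimate

flat = stepwise_loglike(times, 10.0, A, [0.5, 0.5])     # constant rate
peaked = stepwise_loglike(times, 10.0, A, [0.9, 0.1])   # rate concentrated in bin 1

println("log-likelihood, flat model:   ", flat)
println("log-likelihood, peaked model: ", peaked)
```

- For these toy data all events fall in the first phase bin, so the model concentrating the rate there yields the higher likelihood.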

"""

# ╔═╡ c7fc2cd6-6715-46d7-a9e0-c6e0bb7b13eb
md"""
## Reference & Material

Material and papers related to the topics discussed in this lecture.

- [Ivezić et al. (2020) - "Statistics, Data Mining, and Machine Learning in Astronomy"](https://ui.adsabs.harvard.edu/abs/2020sdmm.book.....I/abstract)
- [Gregory & Loredo (1992) - "A New Method for the Detection of a Periodic Signal of Unknown Shape and Period"](https://ui.adsabs.harvard.edu/abs/1992ApJ...398..146G/abstract)
"""

# ╔═╡ 7661198f-10bc-4448-be6b-9f4fd1a16f3c
md"""
## Further Material

Papers for examining more closely some of the discussed topics.

- [Ditlevsen et al. (2006) - "The DO-climate events are noise induced: Statistical questioning of the 1470 years cycle"](https://ui.adsabs.harvard.edu/abs/2006AGUFMGC24A..07D/abstract)
- [Weigt et al. (2019) - "Observations of Jupiter's auroral emission during Juno apojove 2017"](https://ui.adsabs.harvard.edu/abs/2019EPSC...13.1660W/abstract)
- [Bao & Li (2020) - "Periodic X-ray sources in the Galactic bulge: application of the Gregory-Loredo algorithm"](https://ui.adsabs.harvard.edu/abs/2020MNRAS.498.3513B/abstract)
- [Bao & Li (2021) - "Searching for quasi-periodic oscillations in active galactic nuclei of the Chandra Deep Field South"](https://ui.adsabs.harvard.edu/abs/2022MNRAS.509.3504B/abstract)
"""

# ╔═╡ f1d7038f-f94c-4b71-8f51-b0c03452ad29
md"""
### Credits
***

This notebook contains material obtained from 
"""

# ╔═╡ aeaad396-5e2a-4ec5-8b05-e160e5712f35
cm"""
## Course Flow

<table>
  <tr>
    <td>Previous lecture</td>
    <td>Next lecture</td>
  </tr>
  <tr>
    <td><a href="./open?path=Lectures/Lecture - Wavelet Analysis/Lecture-Wavelet-Analysis.jl">Lecture about wavelet analysis</a></td>
    <td><a href="./open?path=Lectures/Science Case - FRBs/Lecture-FRBs.j">Science case about FRBs</a></td>
  </tr>
 </table>

"""

# ╔═╡ 7aecb81a-a226-4133-9182-53e159f5fc43
md"""
**Copyright**

This notebook is provided as [Open Educational Resource](https://en.wikipedia.org/wiki/Open_educational_resources). Feel free to use the notebook for your own purposes. The text is licensed under [Creative Commons Attribution 4.0](https://creativecommons.org/licenses/by/4.0/), the code of the examples, unless obtained from other properly quoted sources, under the [MIT license](https://opensource.org/licenses/MIT). Please attribute the work as follows: *Stefano Covino, Time Domain Astrophysics - Lecture notes featuring computational examples, 2025*.
"""

# ╔═╡ Cell order:
# ╟─72a2571d-dc92-41d6-b73b-ba9e985ca4f1
# ╟─12200cd2-5e1e-4d96-9284-60fb950fd70a
# ╠═d469c921-1c1b-49b7-bbad-9bfb1e53fbd4
# ╠═ca916ac4-71f2-4b58-878c-3e40cf341b46
# ╠═3c2986d8-2379-4cd5-a747-435a5180076f
# ╟─a81e5916-c0e7-410d-8c6e-917303ab087a
# ╟─17903372-3749-4bfc-9519-7181bc0539f1
# ╟─0e4c318a-511c-4646-9b33-634e5cb87c6d
# ╟─39c8c2cb-e6f6-4d74-8792-ba96e057744a
# ╟─b5dc35e4-0f85-468c-8f59-4ac0b9b00b3e
# ╠═93f9b75c-bd43-4229-9ddd-faba108afe65
# ╟─5deba4b4-03ca-40f6-a6f7-aa17c2dc7eaf
# ╠═93d67ed4-3811-414d-a86f-6e951d5e3dbf
# ╟─14175fbe-5aea-4fe0-b9b1-f99049d9cdde
# ╠═69256ec4-c8b2-4110-896f-ec8bfa01d662
# ╠═3cf771fc-30d2-4ded-b17e-f6b66a35d218
# ╟─3b121627-05b3-4238-9901-441182885848
# ╠═961dae0a-3bda-4fa3-8197-ca7fafda3ca6
# ╟─f184d0dc-5034-45ed-9253-daff25eaa259
# ╟─bd88c8fa-856b-40d8-9763-942b465d5504
# ╟─9193b904-7dd0-4f2d-8f17-972eb1f82104
# ╟─9f3399e4-d6ba-457b-b5c4-b4acd0238e63
# ╟─92ee75bb-dc75-4f50-8e68-d69f8f3ef705
# ╟─f9bb204d-53d4-4ac1-92dc-4faf5faba96b
# ╟─c7fc2cd6-6715-46d7-a9e0-c6e0bb7b13eb
# ╟─7661198f-10bc-4448-be6b-9f4fd1a16f3c
# ╟─f1d7038f-f94c-4b71-8f51-b0c03452ad29
# ╟─aeaad396-5e2a-4ec5-8b05-e160e5712f35
# ╟─7aecb81a-a226-4133-9182-53e159f5fc43
